Outlier Detection under Interval Uncertainty: Algorithmic Solvability and Computational Complexity

Authors

  • Vladik Kreinovich
  • Luc Longpré
  • Praveen Patangay
  • Scott Ferson
  • Lev Ginzburg
Abstract

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some “normal” values x1, . . . , xn, compute the sample average E and the sample standard deviation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E − k0 · σ, E + k0 · σ] (for some pre-selected parameter k0). In real life, we often have only interval ranges [x̲i, x̄i] for the normal values x1, . . . , xn. In this case, we only have intervals of possible values for the bounds E − k0 · σ and E + k0 · σ. We can therefore identify outliers as values that are outside all k0-sigma intervals. Once we identify a value as an outlier for a fixed k0, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value k0 for which this value is an outlier. In this paper, we analyze the computational complexity of these outlier detection problems, provide efficient algorithms that solve some of these problems (under reasonable conditions), and list related open problems.
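
The definition in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: it is a brute-force check that enumerates all 2^n combinations of interval endpoints, so it is only practical for small n. The enumeration is sufficient here because E − k0 · σ is a concave function of the xi and E + k0 · σ is a convex one, so their extreme values over the box of intervals are attained at endpoint combinations. The function names, the sample data, and the use of the n − 1 form of the sample standard deviation are assumptions made for illustration.

```python
from itertools import product
from statistics import mean, stdev


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for exactly known values."""
    e = mean(values)
    s = stdev(values)  # assumption: sample standard deviation with n - 1
    return e - k0 * s, e + k0 * s


def is_guaranteed_outlier(x, intervals, k0):
    """True if x lies outside every k0-sigma interval consistent with the data.

    Brute force over all 2^n endpoint combinations (hypothetical helper,
    exponential time, intended for small n only).
    """
    lowest_lower = float("inf")
    highest_upper = float("-inf")
    for combo in product(*intervals):
        lo, hi = k_sigma_interval(combo, k0)
        lowest_lower = min(lowest_lower, lo)
        highest_upper = max(highest_upper, hi)
    return x < lowest_lower or x > highest_upper


# Illustrative data: four "normal" values known only up to +/- 0.1.
intervals = [(0.9, 1.1), (1.9, 2.1), (2.9, 3.1), (3.9, 4.1)]
print(is_guaranteed_outlier(10.0, intervals, k0=2))
print(is_guaranteed_outlier(2.5, intervals, k0=2))
```

With these sample intervals, 10.0 falls outside every possible 2-sigma interval, while 2.5 does not.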

Related resources

Outlier Detection under Interval and Fuzzy Uncertainty: Algorithmic Solvability and Computational Complexity

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some “normal” values x1, . . . , xn, compute the sample average E and the sample standard deviation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E − k0 · σ, E + k0 · σ] (for some pre-selected parameter k0). In real life, we often have only interval ranges [x̲i, x̄i] for the normal valu...

Estimating information amount under uncertainty: algorithmic solvability and computational complexity

Sometimes we know the probability of different values of the estimation error ∆x := x̃ − x; sometimes we only know the interval of possible values of ∆x; sometimes we have interval bounds on the cdf of ∆x. To compare different measuring instruments, it is desirable to know which of them brings more information – i.e., it is desirable to gauge the amount of information. For probabilistic u...

Combining Interval, Probabilistic, and Fuzzy Uncertainty: Foundations, Algorithms, Challenges – An Overview

Estimating Information Amount under Interval Uncertainty: Algorithmic Solvability and Computational Complexity

In most real-life situations, we have uncertainty: we do not know the exact state of the world; instead, there are several (n) different states that are consistent with our knowledge. In such situations, it is desirable to gauge how much information we need to gain to determine the actual state of the world. A natural measure of this amount of information is the average number of “yes”-“no” questions t...

Detecting Outliers under Interval Uncertainty: A New Algorithm Based on Constraint Satisfaction

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some “normal” values x1, . . . , xn, compute the sample average E, the sample standard deviation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E − k0 · σ, E + k0 · σ] (for some pre-selected parameter k0). In real life, we ...

Publication date: 2003